% Copyright 2011-2012 David Hadka.  All Rights Reserved.
%
% This file is part of the MOEA Framework User Manual.
%
% Permission is granted to copy, distribute and/or modify this document under
% the terms of the GNU Free Documentation License, Version 1.3 or any later
% version published by the Free Software Foundation; with the Invariant Section
% being the section entitled "Preface", no Front-Cover Texts, and no Back-Cover
% Texts.  A copy of the license is included in the section entitled "GNU Free
% Documentation License".

\chapter{Executor, Instrumenter, Analyzer}

The Executor, Instrumenter and Analyzer classes provide most of the functionality provided by the MOEA Framework.  This chapter walks through several introductory examples using these classes.

\section{Executor}
The Executor class is responsible for constructing and executing runs of an algorithm.  A single run requires three pieces of information: 1) the problem; 2) the algorithm used to solve the problem; and 3) the number of objective function evaluations allocated to solve the problem.  The following code snippet demonstrates how these three pieces of information are passed to the Executor.

\begin{lstlisting}[language=Java]
NondominatedPopulation result = new Executor()
		.withProblem("UF1")
		.withAlgorithm("NSGAII")
		.withMaxEvaluations(10000)
		.run();
\end{lstlisting}

Line 1 creates a new Executor instance.  Lines 2, 3 and 4 set the problem, algorithm and maximum number of objective function evaluations, respectively.  In this example, we are solving the two-objective UF1 test problem using the NSGA-II algorithm.  Lastly, line 5 runs this experiment and returns the resulting approximation set.

Note how, on lines 2 and 3, the problem and algorithm are specified using strings.  There exist a number of pre-defined problems and algorithms which are available in the MOEA Framework.  Furthermore, the MOEA Framework can be easily extended to provide additional problems and algorithms which can be instantiated in a similar manner.

Once the run is complete, we can access the result.  Line 1 above shows that the approximation set, which is a \texttt{NondominatedPopulation}, gets assigned to a variable named result.  This approximation set contains all Pareto optimal solutions produced by the algorithm during the run.  For example, the code snippet below prints out the two objectives to the console.

\begin{lstlisting}[language=Java]
for (Solution solution : result) {
	System.out.println(solution.getObjective(0) + " " +
			solution.getObjective(1));
}
\end{lstlisting}

Line 1 loops over each solution in the result.  A solution stores the decision variables, objectives, constraints and any relevant attributes.  Lines 2 and 3 demonstrate how the objective values can be retrieved from a solution.

Putting all this together and adding the necessary boilerplate Java code, the complete code snippet is shown below.

\begin{lstlisting}[language=Java]
import java.util.Arrays;
import org.moeaframework.Executor;
import org.moeaframework.core.NondominatedPopulation;
import org.moeaframework.core.Solution;

public class Example1 {

	public static void main(String[] args) {
		NondominatedPopulation result = new Executor()
				.withProblem("UF1")
				.withAlgorithm("NSGAII")
				.withMaxEvaluations(10000)
				.run();

		for (Solution solution : result) {
			System.out.println(solution.getObjective(0) 
					+ " " + solution.getObjective(1));
		}
	}

}
\end{lstlisting}

Since line 6 defines the class name to be \texttt{Example1}, this code snippet must be saved to the file \file{Example1.java}.  Compiling and running this file will produce output similar to:

\begin{lstlisting}[language=Plaintext]
0.44231379762506046 0.359116256460771
0.49221581122496827 0.329972177772519
0.8024234727753593 0.11620643510507386
0.7754123383456428 0.14631335018878214
0.4417794310706653 0.3725544250337767
0.11855110357018901 0.6715178312971422
...
\end{lstlisting}

The \texttt{withProblem} and \texttt{withAlgorithm} methods allow us to specify the problem and algorithm by name.  Changing the problem or algorithm is as simple as changing the string passed to these methods.  For example, replacing line 11 in the code snippet above to \texttt{withAlgorithm(``GDE3'')} will now allow you to run the Generalized Differential Evolution 3 (GDE3) algorithm.  Most existing MOEA libraries require users to instantiate and configure each algorithm manually.  In the MOEA Framework, such details are handled by the Executor.

The MOEA Framework is distributed with a number of built-in problems and algorithms.  See the API documentation for \texttt{StandardProblems} and \texttt{StandardAlgorithms} to see the list of problems and algorithms available, respectively.

When specifying only the problem, algorithm and maximum evaluations, the Executor assumes default parameterizations for the internal components of the algorithm.  For instance, NSGA-II parameters include the population size, the simulated binary crossover (SBX) rate and distribution index, and the polynomial mutation (PM) rate and distribution index.  Changing the parameter settings from their defaults is possible using the \texttt{setProperty} methods.  Each parameter is identified by a key and is assigned a numeric value.  For example, all of NSGA-II's parameters are set in the following code snippet:

\begin{lstlisting}[language=Java]
NondominatedPopulation result = new Executor()
		.withProblem("UF1")
		.withAlgorithm("NSGAII")
		.withMaxEvaluations(10000)
		.withProperty("populationSize", 50)
		.withProperty("sbx.rate", 0.9)
		.withProperty("sbx.distributionIndex", 15.0)
		.withProperty("pm.rate", 0.1)
		.withProperty("pm.distributionIndex", 20.0)
		.run();
\end{lstlisting}

Each algorithm defines its own parameters.  Refer to the API documentation for the exact parameter keys.  Any parameters not explicitly defined using the \texttt{withProperty} methods will be set to their default values.

The Executor also provides some more advanced features.  First, it contains several methods for distribution objective function evaluations across multiple processing cores or computers.  Distributing evaluations can significantly speed up a run through parallelization.  The simplest way to enable parallelization is through the \texttt{distributeOnAllCores} method.  This will distribute objective function evaluations across all cores on your local computer.  Note that not all problems can support parallelization.  In such cases, the MOEA Framework will still run the algorithm, but it will not be parallelized.

Second, the MOEA Framework provides checkpointing functionality.  As an algorithm is running, checkpoint files will be periodically saved.  The checkpoint file stores the current state of the algorithm.  If the run is interrupted, such as a power outage, the run can be resumed at the last saved checkpoint.  The \texttt{setCheckpointFile} sets the file location for the checkpoint file, and \texttt{checkpointEveryIteration} or \texttt{setCheckpointFrequency} control how frequently the checkpoint file is saved.

Resuming a run from a checkpoint occurs automatically.  If the checkpoint file does not exist, a run starts from the beginning.  However, if the checkpoint file exists, then the run is automatically resumed at that checkpoint.  Care must be taken when using checkpoints as they can be a source of confusion for new users.  For instance, using the same checkpoint file from an unrelated run can cause unexpected behavior or an error.  For this reason, checkpoints are recommended only when solving time-consuming problems.

The snippet below extends our example with distributed evaluations and checkpointing.  If running this example multiple times, you will need to delete the checkpoint file to restart from the beginning.

\begin{lstlisting}[language=Java]
NondominatedPopulation result = new Executor()
		.withProblem("UF1")
		.withAlgorithm("NSGAII")
		.withMaxEvaluations(10000)
		.withProperty("populationSize", 50)
		.withProperty("sbx.rate", 0.9)
		.withProperty("sbx.distributionIndex", 15.0)
		.withProperty("pm.rate", 0.1)
		.withProperty("pm.distributionIndex", 20.0)
		.distributeOnAllCores()
		.checkpointEveryIteration()
		.withCheckpointFile(new File("UF1_NSGAII.chkpt"))
		.run();
\end{lstlisting}

\section{Instrumenter}
In addition to supporting a multitude of algorithms and test problems, the MOEA Framework also contains a comprehensive suite of tools for analyzing the performance of algorithms.  The MOEA Framework supports both run-time dynamics and end-of-run analysis.  Run-time dynamics capture the behavior of an algorithm throughout the duration of a run, recording how its solution quality and other elements change.  End-of-run analysis, on the other hand, focuses on the result of a complete run and comparing the relative performance of various algorithms.  Run-time dynamics will be introduced in this section using the Instrumenter; end-of-run analysis will be discussed in the following section using the Analyzer.

The Instrumenter gets its name from its ability to add instrumentation, pieces of code which record information, to an algorithm.  A range of data can be collected using the Instrumenter, including:

\begin{enumerate}
  \item Elapsed time
  \item Population size / archive size
  \item The approximation set
  \item Performance metrics
  \item Restart frequency
\end{enumerate}

The Instrumenter works hand-in-hand with the Executor to collect its data.  The Executor is responsible for configuring and running the algorithm, but it allows the Instrumenter to record the necessary data while the algorithm is running.  To start collecting run-time dynamics, we first create and configure an Instrumenter instance.

\begin{lstlisting}[language=Java]
Instrumenter instrumenter = new Instrumenter()
		.withProblem("UF1")
		.withFrequency(100)
		.attachAll();
\end{lstlisting}

First, line 1 creates the new Instumenter instance.  Next, line 2 specifies the problem.  This allows the instrumenter to access the known reference set for this problem, which is necessary for evaluating performance metrics.  Third, line 3 sets the frequency at which the data is recorded.  In this example, data is recorded every 100 evaluations.  Lastly, line 4 indicates that all available data should be collected.

Next, we must create and configure the Executor instance with the following code snippet:

\begin{lstlisting}[language=Java]
NondominatedPopulation result = new Executor()
		.withProblem("UF1")
		.withAlgorithm("NSGAII")
		.withMaxEvaluations(10000)
		.withInstrumenter(instrumenter)
		.run();
\end{lstlisting}

This code snippet is similar to the previous examples of the Executor, but includes the addition of line 5.  Line 5 tells the executor that all algorithms it executes will be instrumented with our Instrumenter instance.  Once the instrumenter is set and the algorithm is configured, we can run the algorithm on line 6.

Once the run completes, we can now access the data collected by the instrumenter.  The data is stored in an Accumulator object.  The Accumulator for the run we just executed can be retrieved with the following line:

\begin{lstlisting}[language=Java]
Accumulator accumulator = instrumenter.getLastAccumulator();
\end{lstlisting}

An Accumulator is similar to a Map in that it stores key-value pairs.  The key identifies the type of data recorded.  However, instead of storing a single value, the Accumulator stores many values, one for each datapoint collected by the Instrumenter.  The data can be extracted from an Accumulator with a loop similar to the following:

\begin{lstlisting}[language=Java]
for (int i=0; i<accumulator.size("NFE"); i++) {
	System.out.println(accumulator.get("NFE", i) + "\t" +  
			accumulator.get("GenerationalDistance", i));
}
\end{lstlisting}

In this code snippet, we are printing out two columns of data: the number of objective function evaluations (NFE) and the generational distance performance indicator.  Including all the boilerplate Java code, the complete example is provided below.

\begin{lstlisting}[language=Java]
import java.io.IOException;
import java.io.File;
import org.moeaframework.core.NondominatedPopulation;
import org.moeaframework.Instrumenter;
import org.moeaframework.Executor;
import org.moeaframework.analysis.collector.Accumulator;

public class Example2 {

	public static void main(String[] args) throws IOException {
		Instrumenter instrumenter = new Instrumenter()
				.withProblem("UF1")
				.withFrequency(100)
				.attachAll();

		NondominatedPopulation result = new Executor()
				.withProblem("UF1")
				.withAlgorithm("NSGAII")
				.withMaxEvaluations(10000)
				.withInstrumenter(instrumenter)
				.run();

		Accumulator accumulator = instrumenter.getLastAccumulator();

		for (int i=0; i<accumulator.size("NFE"); i++) {
			System.out.println(accumulator.get("NFE", i) + "\t" +  
					accumulator.get("GenerationalDistance", i));
		}
	}

}
\end{lstlisting}

Since line 8 defines the class name to be \texttt{Example2}, this code snippet must be saved to the file \file{Example2.java}.  Compiling and running this file will produce output similar to:

\begin{lstlisting}[language=Plaintext]
100    0.6270289757162074
200    0.5843583367503041
300    0.459146246330144
400    0.36799025371209954
500    0.3202295846482732
600    0.2646381449859231
...
\end{lstlisting}

This shows how NSGA-II experiences rapid convergence early in a run.  In the first 600 evaluations, the generational distance decreases to 33\% of its starting value.  Running for longer, you should see the speed of convergence begin to level off.  This type of curve is commonly seen when using MOEAs.  They often experience a rapid initial convergence that quickly levels off.  You can confirm this behavior by running this example on different algorithms.

There are a few things to keep in mind when using the Instrumenter. First, instrumenting an algorithm and recording all the data will slow down the execution of algorithms and significantly increase their memory usage.  For this reason, we strongly recommend limiting the data collected to what you consider important.  To limit the data collected, simply replace the \texttt{attachAll} method with one or more specific attach methods, such as \texttt{attachGenerationalDistanceCollector}.  Changing the collection frequency is another way to reduce the performance and memory usage.

Second, if the memory usage exceeds that which is allocated to Java, you will receive an \texttt{OutOfMemoryException}.  The immediate solution is to increase the amount of memory allocated to Java by specifying the \texttt{-Xmx} command-line option when starting a Java application.  For example, the following command will launch a Java program with 512 MBs of available memory.

\begin{quote}
  \texttt{java -Xmx512m -classpath ...}
\end{quote}

If you have set the \texttt{-Xmx} option to include all available memory on your computer and you still receive an \texttt{OutOfMemoryException}, then you need to decrease the collection frequency or restrict which data is collected.

Third, the Instrumenter only collects data which is provided by the algorithm.  For instance, an Instrumenter with \texttt{attachAdaptiveTimeContinuationCollector} will work perfectly fine on an algorithm that does support adaptive time continuation.  The Instrumenter examines the composition of the algorithm and only collects data for the components it finds.  This also implies that the Instrumenter will work on any algorithm, including any provided by third-party providers.

\section{Analyzer}
The Analyzer provides end-of-run analysis.  This analysis focuses on the resulting Pareto approximation set and how it compares against a known reference set.  The Analyzer is particularly useful for statistically comparing the results produced by two or more algorithms, or possibly the same algorithm with different parameter settings.  To start using the Analyzer, we first create and configure a new instance, as shown below.

\begin{lstlisting}[language=Java]
Analyzer analyzer = new Analyzer()
		.withProblem("UF1")
		.includeAllMetrics()
		.showStatisticalSignificance();
\end{lstlisting}

First, we construct a new Analyzer on line 1.  Next, on line 2 we set the problem in the same manner as required by the Executor and Instrumenter.  Line 3 tells the Analyzer to evaluate all available performance metrics.  Lastly, line 4 enables the statistical comparison of the results.

Next, the code snippet below shows how the Executor is used to run NSGA-II and GDE3 for 50 replications and store the results in the Analyzer.  Running for multiple replications, or seeds, is important when generating statistical results.

\begin{lstlisting}[language=Java]
Executor executor = new Executor()
		.withProblem("UF1")
		.withMaxEvaluations(10000);

analyzer.addAll("NSGAII",  
		executor.withAlgorithm("NSGAII").runSeeds(50));

analyzer.addAll("GDE3",
		executor.withAlgorithm("GDE3").runSeeds(50));
\end{lstlisting}

Lines 1-3 create the Executor, similar to the previous examples except we do not yet specify the algorithm.  Lines 5-6 run NSGA-II.  Note how we first set the algorithm to NSGA-II, run it for 50 seeds, and add the results from the 50 seeds to the analyzer.  Similarly, on lines 8-9 we run GDE3 and add its results to the analyzer.  Note how lines 5 and 8 pass a string as the first argument to \texttt{addAll}.  This gives a name to the samples collected, which can be any string and not necessarily the algorithm name as is the case in this example.

Lastly, we can display the results of the analysis with the following line.

\begin{lstlisting}[language=Java]
analyzer.printAnalysis();
\end{lstlisting}

Putting all of this together and adding the necessary boilerplate Java code results in:

\begin{lstlisting}[language=Java]
import java.io.IOException;
import org.moeaframework.Analyzer;
import org.moeaframework.Executor;

public class Example3 {

	public static void main(String[] args) throws IOException {
		Analyzer analyzer = new Analyzer()
				.withProblem("UF1")
				.includeAllMetrics()
				.showStatisticalSignificance();

		Executor executor = new Executor()
				.withProblem("UF1")
				.withMaxEvaluations(10000);

		analyzer.addAll("NSGAII",  
				executor.withAlgorithm("NSGAII").runSeeds(50));

		analyzer.addAll("GDE3",
				executor.withAlgorithm("GDE3").runSeeds(50));

		analyzer.printAnalysis();
	}

}
\end{lstlisting}

The output of which will look similar to:

\begin{lstlisting}[language=Plaintext]
GDE3:
    Hypervolume:
        Min: 0.4358713988821755
        Median: 0.50408587891491
        Max: 0.5334024567416756
        Count: 50
        Indifferent: []
...
NSGAII:
    Hypervolume:
        Min: 0.3853879478063566
        Median: 0.49033837779756184
        Max: 0.5355927774923894
        Count: 50
        Indifferent: []
...
\end{lstlisting}

Observe how the results are organized by the indenting.  It starts with the sample names, GDE3 and NSGAII in this example.  The first indentation level shows the different metrics, such as Hypervolume.  The second indentation level shows the various statistics for the metric, such as the minimum, median and maximum value.

The indifferent statistics show the statistical significance of the results.  The Analyzer applies the Mann-Whitney and Kruskal-Wallis tests for statistical significance.  If the medians are significantly different, then the indifferent columns shows empty brackets (i.e., \texttt{[]}).  However, if the medians are indifferent, then the samples which are indifferent will be shown within the brackets.

Statistical significance is important when comparing multiple algorithms, particularly when the results will be reported in scientific papers.  Running an algorithm will likely produce different results each time it is run.  This is because many algorithms are stochastic (include sources of randomness).  Because of this, it is a common practice to run each algorithm for multiple seeds, with each seed using a different random number sequence.  As a result, an algorithm does not produce a single result.  It produces a distribution of results.  When comparing algorithms based on their distributions, it is necessary to use statistical tests.  Statistical tests allow us to determine if two distributions (i.e., two algorithms) are similar or different.  This is exactly what is provided when enabling \texttt{showStatisticalSignificance} and viewing the Indifferent entries in the output from Analyzer.

In the example above, we called \texttt{includeAllMetrics} to include all performance metrics in the analysis.  This includes hypervolume, generational distance, inverted generational distance, spacing, additive $\epsilon$-indicator, maximum Pareto front error and reference set contribution.  It is possible to enable specific metrics by calling their corresponding include method, such as \texttt{includeGenerationalDistance}.

\section{Conclusion}
This chapter introduced three of the high-level classes: the Executor, Instrumenter and Analyzer.  The examples provided show the basics of using these classes, but their functionality is not limited to what was demonstrated.  Readers should explore the API documentation for these classes to discover some of their more sophisticated functionality.